Post-translational modifications (PTMs) refer to the covalent modifications of polypeptides after they are
synthesized, adding temporal and spatial regulation to modulate protein functions. Being obligate intracellular
parasites, viruses rely on the protein synthesis machinery of host cells to support replication, and
not surprisingly, many viral proteins are subjected to PTMs. Coronaviruses (CoVs) are a group of enveloped RNA
viruses causing diseases in both humans and animals. Many CoV proteins are modified by PTMs, including
glycosylation and palmitoylation of the spike and envelope proteins, N- or O-linked glycosylation of the
membrane protein, phosphorylation and ADP-ribosylation of the nucleocapsid protein, and other PTMs
on nonstructural and accessory proteins. In this review, we summarize the current knowledge on PTMs of
CoV proteins, with an emphasis on their impact on viral replication and pathogenesis. The ability of some
CoV proteins to interfere with PTMs of host proteins will also be discussed.

Coronaviruses are a family of enveloped RNA viruses causing diseases in both animals and humans. Infection by
animal coronaviruses, such as infectious bronchitis virus (IBV) and transmissible gastroenteritis virus (TGEV), 
reduces the yield and quality of domestic animals and causes great economic loss to the industry worldwide [1], 
whereas the extremely contagious mouse hepatitis virus (MHV) is presumably the most important pathogen 
of laboratory mice [2]. Human coronaviruses, such as HCoV-229E and HCoV-OC43, account for a significant
percentage of common colds in adults [3,4]. Notably, the newly emerged, highly pathogenic human coronaviruses 
severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory syndrome coronavirus 
(MERS-CoV) cause severe diseases with high mortality rates [5,6]. The shared bat origin of SARS-CoV and
MERS-CoV suggests that coronaviruses have an inherent ability to cross the species barrier and become lethal human
pathogens. Therefore, a better understanding of the biology and pathogenesis of this family of viruses is critical in
the face of the threat of future epidemics.

Taxonomically, the family Coronaviridae is divided into two subfamilies: Coronavirinae and Torovirinae. The
subfamily Coronavirinae is further classified into four genera, namely Alphacoronavirus, Betacoronavirus,
Gammacoronavirus and Deltacoronavirus, based on initial antigenic relationship and later genome sequence alignment [7].
Within the genus Betacoronavirus, four lineages (A, B, C and D) can be phylogenetically distinguished. While
the prototypic MHV is a lineage A Betacoronavirus, SARS-CoV and MERS-CoV belong to lineage B and C, respectively.
Current evidence suggests that Alphacoronavirus and Betacoronavirus may have evolved from bat coronaviruses
and later established mammalian tropism, whereas Gammacoronavirus and Deltacoronavirus may have originated from avian
coronaviruses and thus mainly infect avian hosts [8].

Morphologically, coronaviruses are spherical or pleomorphic in shape with an average diameter of 80-120 nm.
Under the electron microscope, the virions are characterized by surface projections constituted by the trimeric 
S-glycoprotein [9]. In some Betacoronaviruses, a second type of shorter projection, contributed by the homodimeric
HE protein, can be observed [10]. The most abundant protein in the virion is the M-glycoprotein, which embeds
into the envelope and provides structural support to the virion. The E protein is a small, integral membrane protein
present at a low amount in the virion, but it plays an essential role during virion assembly and release [11,12]. Inside
the envelope, the helically symmetric nucleocapsid is composed of the RNA genome closely associated with the N
protein in a beads-on-a-string fashion. The positive sense, nonsegmented, ssRNA genome, ranging from 27,000 to
32,000 nucleotides in size, is the largest RNA genome known to date.

Figure 1. Schematic diagram depicting the replication cycle of coronavirus. Coronavirus replication starts with the
binding of the virion to the cognate cell surface receptor, which triggers the fusion between the virus envelope and
the cellular membrane, allowing the nucleocapsid to enter the cytoplasm (attachment and entry). After uncoating,
the genomic RNA is translated to produce pp1a and pp1ab, which are cleaved to form numerous Nsps. Some of the
Nsps induce the formation of DMVs, on which the RTC is assembled. Both gRNA and sgRNA are synthesized via
negative sense intermediates. The sgRNAs encode structural proteins and accessory proteins. Virion assembly occurs in
the ERGIC. Mature virus particles are transported in smooth-walled vesicles and released via the secretory pathway.
DMV: Double membrane vesicle; ER: Endoplasmic reticulum; ERGIC: ER-Golgi intermediate compartment;
gRNA: Genomic RNA species; Nsp: Nonstructural protein; pp: Polyprotein; RTC: Replication transcription complex;
sgRNA: Subgenomic RNA species.

The replication cycle of coronavirus starts with the binding of the S protein to its cognate receptor(s) on the 
host cell surface (Figure 1), which triggers a conformational change in the S2 subunit and results in the fusion 
between the viral envelope and the cellular membrane, thereby delivering the nucleocapsid into the cytoplasm [9]. 
After uncoating, the genomic RNA containing a 5’-cap and a 3’-poly(A) tail is recognized by the host translation
machinery to synthesize polyprotein 1a (pp1a), as well as a larger polyprotein 1ab (pp1ab) in a process involving
ribosomal frameshifting [13]. Autoproteolytic cleavage of pp1a and pp1ab produces 15-16 nonstructural proteins
(nsps) with diverse functions. Among them, nsp3 and nsp5 encode the papain-like protease (PLpro) activity
and the chymotrypsin-like main protease (Mpro) activity, respectively, whereas nsp12 encodes the critical RNA-dependent
RNA polymerase (RdRp) activity [14,15]. In the replication/transcription complex closely associated
with virus-induced double membrane vesicles (DMVs) or spherules, positive-sense progeny genomic RNA is 
synthesized from the negative-sense intermediate. On the other hand, a nested set of subgenomic RNA (sgRNA) 
species is synthesized by discontinuous transcription of the genome, from which structural and accessory proteins 
are translated. Transmembrane structural proteins (S, M and E) are synthesized, folded and modified in the
endoplasmic reticulum (ER) and transported to the ER-Golgi intermediate compartment, where they interact
with the encapsidated genome to assemble progeny virions. Finally, virions budded into the ER-Golgi intermediate
compartment are transported inside smooth-walled vesicles and released to the extracellular milieu via the secretory
pathway, thereby starting a new round of viral replication. Infection by some coronaviruses also causes the fusion of
the infected cell with neighboring uninfected cells, resulting in a large multinucleated syncytium. The replication
cycle of coronavirus is shown in Figure 1. 

Post-translational modifications (PTMs) are the covalent modifications of proteins after they are translated by the 
ribosomes. By introducing new functional groups, such as phosphate and carbohydrates, PTMs extend the chemical 
repertoire of the 20 standard amino acids and play important roles in regulating the folding, stability, enzymatic 
activity, subcellular localization and interaction of a protein with other proteins. Common PTMs involving 
structural changes to the polypeptide include proteolytic cleavage and disulfide bond formation, whereas common 
PTMs involving the addition of functional groups include phosphorylation, glycosylation and lipidation (such as 
palmitoylation and myristoylation). Proteins can also be modified by the covalent conjugation of one or more
smaller proteins or peptides, as in the case of ubiquitination, SUMOylation, ISGylation and NEDDylation. PTMs
are almost always catalyzed by modifying enzymes. For example, N-linked glycosylation requires the sequential 
activities of enzymes that synthesize the precursor dolichol-linked oligosaccharide, oligosaccharyltransferase that 
transfers the glycan to a specific consensus sequence (N-X-S/T, where X is any amino acid except proline), and 
glycosidases and glycosyltransferases that mediate further processing of the N-linked glycan. On the other hand, 
protein ubiquitination requires three types of enzymes: ubiquitin-activating enzymes (E1), ubiquitin-conjugating 
enzymes (E2) and ubiquitin ligases (E3), acting sequentially in a highly regulated manner. 
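
As an illustration of the N-X-S/T consensus sequence mentioned above, the following minimal Python sketch scans a protein sequence for putative N-linked glycosylation sequons. The function name find_sequons and the toy sequence are hypothetical and chosen only for illustration; they do not come from any coronavirus protein or published tool. A match only marks a potential site and says nothing about whether the site is actually glycosylated in cells.

```python
import re
from typing import List

# N-X-S/T sequon: Asn, then any residue except Pro, then Ser or Thr.
# A zero-width lookahead is used so that overlapping sequons
# (for example the two in "NNTS") are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein_seq: str) -> List[int]:
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon.

    Hypothetical helper for illustration; it predicts putative sites only.
    """
    seq = protein_seq.upper()
    return [match.start() + 1 for match in SEQUON.finditer(seq)]


# Toy sequence (an assumption, not a real viral protein).
toy_sequence = "MNGTAANPSKNNTSQ"
print(find_sequons(toy_sequence))  # N-P-S at position 7 is skipped because X is Pro
```

Positions are reported with 1-based numbering so that they can be compared directly with residue labels of the form N330 or N357.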

Being obligate intracellular parasites, viruses rely on the protein synthesis machinery of host cells to support their 
replication. Therefore, it is not surprising that many viral proteins are modified by PTMs. Accumulating evidence 
suggests that coronavirus proteins are modified by various kinds of PTMs, which remarkably affect viral replication
and pathogenesis. In this review, we summarize the current knowledge on PTMs of coronavirus proteins, including
structural, nonstructural and accessory proteins, with an emphasis on their roles and function in coronavirus biology
and host-virus interaction. The ability of some CoV proteins to interfere with PTMs of host proteins will also be
discussed. 


S protein 

S protein is the largest among the four coronavirus structural proteins. It is a type I transmembrane protein 
(Figure 2A), with a large N-terminal ectodomain, a single transmembrane (TM) domain and a short C-terminal 
endodomain [9,16]. In most coronaviruses, the S protein is cleaved by host proteases into two functional subunits of 
roughly the same size [17,18]. The N-terminal S1 domain makes up the globular head of the S protein and harbors 
the receptor binding domain (RBD), whereas the S2 domain constitutes the stem of the S protein, containing the 
fusion peptide followed by two heptad repeat regions (HR1 and HR2), the TM domain and the cytosolic tail [19]. 
The luminal (virion exterior) ectodomain of coronavirus S protein is modified by N-linked glycosylation and 
disulfide bond formation, whereas conserved cysteine residues in the cytosolic tail are modified by palmitoylation
(Figure 2B & Table 1) [20-22].


Disulfide bond formation 

Disulfide bonding contributes to the folding of MHV S proteins. When MHV-infected cells were briefly exposed
to the reducing agent dithiothreitol added to the culture medium, newly synthesized MHV S protein was completely
reduced, as indicated by a mobility shift in a nonreducing gel [20]. Reduction of MHV S protein was associated
with a loss of conformation, as the protein could no longer be recognized by a conformation-specific monoclonal 
antibody. When dithiothreitol was withdrawn, the S protein folded aberrantly into disulfide-linked aggregates, 
from which properly folded S protein subsequently dissociated [20]. Therefore, disulfide bond formation is essential 
for the correct folding, trafficking and trimerization of MHV S protein. 

In another study, the recombinant S1 domain of SARS-CoV S protein was used to study the redox state of the 20 
cysteine residues [23]. Interestingly, four cysteines remained unpaired in mature S1, and chemical reduction using 
β-mercaptoethanol did not impair the binding of S1 to the cognate receptor ACE2. Furthermore, treatment with a
sulfhydryl-blocking agent (DTNB) or the oxidoreductase inhibitor bacitracin did not inhibit the fusion of SARS-CoV
pseudotyped particles, while the fusion of HIV- or MLV-pseudotyped virus was significantly affected [23].
These data suggest that the S1 domain of SARS-CoV S protein exhibits a high level of insensitivity to redox state.

Figure 2. Schematic diagram showing the membrane topology and PTMs of coronavirus S protein. (A) Membrane topology of
coronavirus S protein. The trimeric S protein, the six-helix bundle of S2 and the globular S1 domains are illustrated. (B) Major functional
domains and PTMs on coronavirus S protein (not to scale). Protein-protein interactions involving some of the modified residues are also
indicated.
Endo: Endodomain; FP: Fusion peptide; HR: Heptad repeat; N-gly: N-glycosylation; Palm: Palmitoylation; RBD: Receptor-binding domain;
S: Spike; TM: Transmembrane domain.

N-linked glycosylation

N-linked glycosylation of coronavirus S protein was first described for MHV in the 1980s [21]. MHV S protein 
in the rough ER was found to acquire high mannose oligosaccharides. Treatment with the Golgi transport blocker
monensin inhibited the transport of MHV S protein from the trans-Golgi network to the cell surface [21]. Later studies
demonstrated that the S proteins of IBV [24], TGEV [25,26] and bovine coronavirus (BCoV) [27] were also modified by
N-linked glycosylation. Using pulse-chase experiments coupled with fractionation, it was found that high mannose
glycans were acquired by monomers of the TGEV S protein, followed by the rate-limiting assembly of monomers
into a trimeric structure and terminal glycosylation of the newly assembled trimers [28]. Similarly, SARS-CoV S
protein was found to acquire high mannose oligosaccharides and trimerize as early as 30 min after entry into the ER,
prior to the acquisition of complex glycans in the Golgi complex [29]. The maturation status of SARS-CoV S protein 
can thus be monitored by its sensitivity to endoglycosidase H (endo H), which hydrolyzes high mannose glycans 
but not complex glycans [30]. Using mass spectrometry, the structure of N-linked glycans on SARS-CoV S protein 
was determined, which were composed of high mannose, hybrid and complex glycans with and without bisecting 
N-acetyl-galactosamine (GalNAc) and core fucose [31].

Table 1. Post-translational modifications in coronavirus proteins and proposed functions.

Protein  Modification      Function
S        Disulfide bridge  Protein folding and trimerization (MHV)
         N-glycosylation   Constitute neutralizing epitopes (TGEV, BCoV, SARS-CoV, IBV);
                           mutations lead to antigenic shift (IBV);
                           membrane fusion (IBV);
                           NOT required for receptor binding (SARS-CoV);
                           lectin-mediated virion attachment (SARS-CoV);
                           activation of innate immunity (TGEV)
         Palmitoylation    S protein trafficking and folding (MHV, SARS-CoV);
                           virion assembly and infectivity (MHV, TGEV);
                           interaction between S and M (MHV)
E        N-glycosylation   Contribute to two distinct membrane topologies (SARS-CoV);
                           NOT required for interaction between E and M (SARS-CoV)
         Palmitoylation    Virion assembly and infectivity (MHV);
                           E protein stability and trafficking (MHV)
M        O-glycosylation   NOT required for virion assembly (MHV);
                           induction of type I interferon (MHV)
         N-glycosylation   NOT required for virion assembly (SARS-CoV);
                           M protein folding and trafficking (SARS-CoV)
N        Phosphorylation   Virion assembly ? (MHV, IBV, BCoV);
                           increase specificity of RNA binding (IBV);
                           DDX1 recruitment and facilitate template read-through (MHV);
                           N protein subcellular localization (SARS-CoV);
                           antigenicity of N protein (SARS-CoV)
         Cleavage          Cleaved by caspase 3/6/7 (TGEV, SARS-CoV, IBV)
         ADP-ribosylation  ADP-ribosylated protein incorporated into virion (MHV)
         Sumoylation       Promote homodimerization (SARS-CoV)
Nsp4     N-glycosylation   Viral RNA synthesis and DMV formation (MHV)
Nsp9     Disulfide bridge  Enhance binding affinity to ssRNA/ssDNA (HCoV-229E)
Nsp16    Ubiquitination    Proteasomal degradation (SARS-CoV)
HE       N-glycosylation   Unknown (MHV, BCoV)
3a       O-glycosylation   Unknown (SARS-CoV)
8ab      N-glycosylation   Protect 8ab from proteasomal degradation (SARS-CoV)
3b       N-glycosylation   Unknown (TGEV Purdue strain)

BCoV: Bovine coronavirus; DMV: Double membrane vesicle; IBV: Infectious bronchitis virus; MHV: Mouse hepatitis virus; SARS-CoV: Severe acute respiratory syndrome coronavirus; TGEV: Transmissible gastroenteritis virus.


With the advent of molecular cloning technologies, the coding sequences of S proteins from numerous coronaviruses
were cloned and the putative N-linked glycosylation sites were predicted from the sequence information.
For instance, 20 [32] or 21 [33] glycosylation sites were predicted in the S protein of MHV, 19 in bovine enteric
coronavirus [34,35], 30 in HCoV-229E [36], 33 in TGEV [37], 20 [38] or 22 [39] in HCoV-OC43, 29 [40] or 27 [41] in
porcine epidemic diarrhea virus (PEDV), 33 in feline enteric coronavirus [42], 29 or 33 in canine coronavirus [43], and
20 or 21 in canine respiratory coronavirus [44].

However, it should be noted that not all of the putative glycosylation sites are functional. In fact, among the 23 
putative glycosylation sites in the SARS-CoV S protein, only 12 sites were actually glycosylated, as determined by 
mass spectrometry following peptide-N-glycosidase F (PNGase F) digestion [45]. Recently, we have used in-solution
deglycosylation combined with mass spectrometry to determine the N-linked glycosylation sites in the IBV S
protein [46]. As deglycosylation was carried out in H₂¹⁸O, incorporation of ¹⁸O into Asp resulted
in a mass increment of 2.98 Da, leading to a more robust identification of glycosylated sites by mass spectrometry.
Among the 29 predicted N-linked glycosylation sites, only eight sites were confirmed using this method. Therefore,
the majority of the predicted N-linked glycosylation sites on coronavirus S protein may not be modified, possibly due
to the massive amount of S protein produced during infection and the limited capacities of the cellular glycosylation 
enzymes. Some sites may be preferentially modified due to their relatively better spatial availability, while some 
inefficiently and/or partially glycosylated sites may not reach the detection limit for mass spectrometry [46]. Thus, 
the predicted glycosylation sites are not fully utilized in coronavirus S protein. Preferential glycosylation on certain
critical sites, such as those located within or near the RBD, may be of particular importance in the functionality of
S protein. 
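
The mass increment of about 2.98 Da quoted above can be checked with a short calculation. The sketch below assumes that PNGase F converts the glycosylated Asn into Asp (loss of one N and one H, gain of one O) and that the incoming oxygen atom comes from the solvent. The monoisotopic masses are standard reference values rounded to six decimals, and the function name deamidation_shift is hypothetical.

```python
# Monoisotopic masses in Da (standard reference values, rounded).
MASS_H = 1.007825
MASS_N = 14.003074
MASS_O16 = 15.994915
MASS_O18 = 17.999160


def deamidation_shift(solvent_oxygen_mass: float) -> float:
    """Mass change for Asn -> Asp: the side chain loses N and H and gains one O.

    Assumption: the single new oxygen atom is taken from the solvent water.
    """
    return solvent_oxygen_mass - MASS_N - MASS_H


shift_in_h2o16 = deamidation_shift(MASS_O16)  # ordinary water
shift_in_h2o18 = deamidation_shift(MASS_O18)  # 18O-labeled water

print(f"Asn -> Asp in H2(16)O: +{shift_in_h2o16:.3f} Da")
print(f"Asn -> Asp in H2(18)O: +{shift_in_h2o18:.3f} Da")
```

Under these assumptions the shift in ordinary water is just under 1 Da, whereas in ¹⁸O-labeled water it is close to 3 Da, which is why the labeled sites are easier to distinguish in the mass spectra.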

N-linked glycosylation contributes significantly to the conformation of coronavirus S protein, and therefore 
profoundly affects the receptor binding and antigenicity of S protein. For example, early studies showed that the 
binding of IBV neutralizing antibodies was dependent on the glycosylation of the IBV S protein [47]. Consistently, 
mutations that introduced new N-linked glycosylation sites in the S1 domain were shown to contribute to antigenic 
shifting of IBV [48]. Also, when the S1 domain of BCoV S protein was cloned and expressed in insect cells, the 
mature protein was glycosylated and bound by neutralizing monoclonal antibodies [49]. Likewise, when cells were
infected with TGEV in the presence of tunicamycin, an inhibitor of N-linked glycosylation, the antigenicity of
both S and M proteins was significantly reduced [50]. Similarly, when the overexpressed full-length homotrimeric
SARS-CoV S protein was treated with PNGase F under native conditions, the protein was no longer recognized by
neutralizing antisera raised against purified virions [51]. These findings suggest that N-linked glycosylation may play
an important role in constituting the native structure of coronavirus S protein, thereby affecting its antigenicity.
During its maturation in the ER, SARS-CoV S protein binds to the molecular chaperone calnexin [52]. Compared 
with control, SARS-CoV S-pseudotyped virions produced in calnexin-knockdown cells contained S protein with 
aberrant N-glycans and exhibited significantly lower infectivity [52]. As for IBV, we recently showed that N-D or 
N-Q mutations at the N-linked glycosylation site N212 or N276 abolished the function of S protein to induce 
cell-cell fusion and the infectivity of corresponding recombinant viruses [46]. 

Nonetheless, in some instances, the antigenicity of coronavirus S protein does not depend on its glycosylation 
status. For example, when the S protein of TGEV was expressed by recombinant baculovirus in insect cells, the 
recombinant S protein acquired high mannose glycans, but the complete processing into complex glycans was 
not efficient. However, the recombinant TGEV S protein still exhibited antigenic properties and induced a high 
level of neutralizing antibodies [53]. Similarly, a potent neutralizing monoclonal antibody against the S1 protein 
of SARS-CoV could bind to the deglycosylated S1 protein, suggesting that the epitope was not
glycosylation-dependent [54]. In one early study, the RBD of SARS-CoV S protein was mapped to amino acid residues 319-518,
which contained two potential glycosylation sites N330 and N357. However, mutation of N330 or N357 to
either alanine or glutamine did not affect the binding ability of RBD-containing fragment to the cognate receptor
ACE2 [55]. Later, the structure of RBD of SARS-CoV S protein complexed with human ACE2 was determined,
and neither N330 nor N357 was positioned in the interface where the two proteins interacted [56]. It was
thus concluded that glycosylation did not always constitute neutralizing epitopes within the RBD. A later study
exploring recombinant RBD of SARS-CoV S protein as a vaccine candidate found that yeast-expressed recombinant
RBD (spanning amino acid residues 318-536) with glycosylation sites removed indeed induced a higher level of
neutralizing antibody in immunized mice, compared with wild type RBD [57].

Although not essential for its binding to the cellular receptor ACE2, N-linked glycosylation of SARS-CoV S protein
may still contribute to efficient attachment of virions to the host cells. The C-type lectin DC-SIGN was shown
to facilitate cell entry of SARS-CoV [58,59]. The DC-SIGN binding region was mapped to amino acid residues
324-386 of SARS-CoV S protein, and pseudotyped viruses with mutated N-linked glycosylation sites (N330Q or N357Q)
had significantly reduced DC-SIGN-binding capacity [60]. In a separate study, seven glycosylation sites (N109,
N118, N119, N158, N227, N589 and N699) in SARS-CoV S protein were also shown to be critical for virus
entry mediated by DC-SIGN and/or L-SIGN [61]. The interaction between N-linked glycans and lectins can
also negatively affect receptor binding of coronavirus. For example, mannose-binding lectin was shown to interact 
with SARS-CoV S-pseudotyped virus and block viral binding to DC-SIGN, and N-linked glycosylation at N330 
was found critical for the specific interaction between mannose-binding lectin and SARS-CoV S protein [62]. Since 
N330 is also critical for DC-SIGN-binding, competitive binding between the two lectins to N-linked glycans on 
SARS-CoV S protein may have some implications in the attachment and entry of virions. Finally, LSECtin, a lectin
coexpressed with DC-SIGN on sinusoidal endothelial cells in the liver and lymph node, was also shown to interact 
with SARS-CoV S-pseudotyped virus [63]. 

N-linked glycosylation may also contribute to the activation of innate immune response in coronavirus-infected 
cells. Pretreatment of TGEV-infected cells with the plant lectin concanavalin A before exposure to porcine peripheral
blood mononuclear cells led to a dose-dependent reduction in the induction of IFN-α. Also, inhibition of N-linked
glycosylation by tunicamycin or removal of N-linked glycans by PNGase F reduced TGEV-induced IFN-α
production [64]. Therefore, N-linked glycans on coronavirus S protein may be a pathogen-associated molecular
pattern recognized by host pattern recognition receptors, which in turn activate downstream antiviral innate
immune responses. However, compared with the parental PEDV strain, the more effective host immune response
against the cell-attenuated Zhejiang08 strain was associated with the lack of a potential glycosylation site in its
S protein [65]. Thus, the effect of S protein glycosylation on the immune response is complex and may vary
depending on the specific coronavirus and host system in question. 

Caution should also be taken regarding the biological systems used to express the coronavirus S protein. For 
example, a recent study evaluated the antigenicity of recombinant IBV S1 protein expressed in mammalian cells. 
The result showed that the recombinant S1 protein was highly glycosylated and was able to induce the production of 
antibodies against S1 in immunized chickens. However, these antibodies had lower neutralizing activity compared 
with those generated by chickens immunized with inactivated IBV [66]. Therefore, the glycosylation pattern of 
IBV S protein synthesized in mammalian cells may differ from that produced in avian cells, thereby affecting
its antigenicity in vivo. Similarly, the glycosylation pattern of other coronavirus proteins may also be differentially
affected by the expression systems, thereby changing their behaviors in relevant functional assays. 

Interestingly, some of the known cellular receptors for coronavirus have also been shown to be modified by 
glycosylation. N-linked glycosylation of CEACAM1, the cellular receptor protein of MHV, was found essential 
for its binding to MHV-A59 virions [67], although recombinant proteins with mutations in the three N-linked 
glycosylation sites in the N-terminal domain were still functional [68]. On the other hand, insertion of an N-linked 
glycosylation site into human APN, the receptor for HCoV-229E, abolished its activity to bind HCoV-229E 
virions [69]. Similarly, N-linked glycosylation of DPP4, the cognate receptor of MERS-CoV, dramatically affects its 
binding to MERS-CoV S protein. Normally, mouse DPP4 does not support MERS-CoV entry. However, when 
the N328 glycosylation site was mutated in the presence of a secondary mutation A288L, the binding affinity of 
mouse DPP4 to MERS-CoV was significantly increased [70]. Conversely, when the corresponding glycosylation site 
was introduced to human DPP4, the binding of MERS-CoV was significantly reduced [70]. Therefore, glycosylation 
of coronavirus receptors contributes significantly to the host tropism of coronavirus infection, although additional 
sequence and structural determinants of S protein are also involved [71]. 


Palmitoylation 

Palmitoylation of coronavirus S protein was initially identified in cells infected with MHV-A59, as ³H-palmitate
was found to be incorporated in unglycosylated S protein in MHV-infected cells treated with tunicamycin [22].
Treatment with the palmitoyl acyltransferase inhibitor 2-bromopalmitate at a nontoxic dose reduced palmitoylation of
MHV S protein and led to a significant reduction in the infectivity of MHV [72]. Reduction of S palmitoylation
correlated with a decreased level of S associated with M protein and subsequent exclusion of S from virions. 
However, underpalmitoylated S protein could still be expressed on the cell surface to induce cell-cell fusion. 
The C1347F/C1348S mutant virus harboring mutations in the putative palmitoylation sites exhibited reduced 
infectivity, further supporting the importance of palmitoylation in virion assembly and infectivity [72]. Using 
antiviral heptad repeat peptides that only bind to folding intermediates of the fusion process, it was found that 
MHV S mutants lacking the palmitoylated cysteines were trapped in transitional folding states almost ten times
longer than wild-type MHV S protein, leading to slower cell entry and reduced infectivity [73]. In a later study
using reverse genetics, the nine cytoplasmic cysteines in MHV S protein were singly or doubly substituted to 
alanine [74]. Interestingly, no single specific cysteine in the MHV S endodomain was essential for viral replication, 
but a minimum of three cysteines within the motif independent of position was required for the recovery of viable 
recombinant MHV [74]. 

The cytoplasmic portion of SARS-CoV S protein contains four cysteine-rich clusters. Mutational analysis showed 
that cysteine clusters I and II were modified by palmitoylation. Although cell surface expression of SARS-CoV S 
protein was not significantly affected by mutations in cysteine clusters I and II, S-mediated cell fusion was markedly 
reduced compared with wild-type protein, suggesting that palmitoylation in the endodomain may be required for 
the fusogenic activity of SARS-CoV S protein [75]. In a later study, a recombinant nonpalmitoylated SARS-CoV 
S protein was generated by mutating all nine cytoplasmic cysteines to alanines [76]. Using this nonpalmitoylated 
mutant, it was shown that similar to MHV S protein, palmitoylation of the SARS-CoV S protein was required 
for its partitioning into detergent-resistant membranes and for cell-cell fusion. However, unlike MHV S protein, 
palmitoylation of SARS-CoV S protein was not required for S-M interaction [76]. Interestingly, treatment with nitric
oxide or its derivatives led to a reduction in the palmitoylation of SARS-CoV S protein, which affected its binding
to the cognate receptor ACE2 [77].

Figure 3. Schematic diagram showing the membrane topology and PTMs of coronavirus E protein. (A) Membrane
topology of coronavirus E protein. The exterior, interior and transmembrane domains of the protein are shown.
(B) Major functional domains and PTMs on coronavirus E protein (not to scale).
Endo: Endodomain; MHV: Mouse hepatitis virus; N-gly: N-glycosylation; Palm: Palmitoylation; PTM: Post-translational
modification; SARS-CoV: Severe acute respiratory syndrome coronavirus; TM: Transmembrane domain.


The S protein of the Alphacoronavirus TGEV is also modified by palmitoylation, and inhibition of palmitoylation
by 2-bromopalmitate treatment reduced TGEV replication in cell culture [78]. Although palmitoylation of TGEV S
protein was essential for its incorporation into virus-like particles (VLPs), the interaction between TGEV S and M
proteins was not affected by the lack of palmitoylation [78]. Therefore, depending on the coronavirus in question,
palmitoylation may differentially affect the folding, fusogenic activity and/or protein-protein interaction of S
protein. Palmitoylation of S protein has not been characterized for other coronaviruses.


E protein 

E protein is a small protein (8-12 kDa) found in limited amounts in the virion [11]. Current evidence suggests
that E protein is a type I transmembrane protein with a short N-terminal ectodomain and a C-terminal endodomain
(Figure 3A), but alternative membrane topologies have also been proposed [79,80]. Biophysical studies show that 
some coronavirus E proteins can form pentameric structures exhibiting ion channel activity [12,81]. The E protein 
is reported to be modified by glycosylation and palmitoylation (Figure 3B & Table 1) [80,82]. 


Glycosylation 

Based on sequence prediction, SARS-CoV E protein contains two potential N-linked glycosylation sites on N48 
and N66, whereas IBV E contains one potential site on N5. Although a topological study demonstrated that IBV E
protein spanned the membrane once with a luminal N-terminus and a cytoplasmic C-terminus, the glycosylation 
site on N5 was not functional [79,80]. On the other hand, SARS-CoV E protein in transfected cells seemed to 
adopt two distinct membrane topologies [80]. In one form, both the N- and C-termini were exposed to the 
cytoplasmic side and the protein was not modified by glycosylation. In an alternative minor form, SARS-CoV E 
protein was shown to be glycosylated on N66, with the C-terminus exposed to the luminal side [80]. A later study 
using transfected SARS-CoV E protein with an N-terminal Myc-tag confirmed that SARS-CoV E protein was 
glycosylated co-translationally [83]. Although the two putative TM domains were required for its interaction with 
the SARS-CoV M protein, the hydrophilic region (60-76) flanking the N66 glycosylation site was dispensable 
as shown by co-immunoprecipitation experiments [83]. The glycosylation of SARS-CoV E protein during actual 
infection and its biological function remain to be further investigated. 
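
The sequence-based prediction mentioned above presumably rests on the standard N-linked glycosylation sequon, Asn-X-Ser/Thr, where X is any residue except proline. The following is a minimal sketch of such a scan, not the method used in the cited studies; the function name `find_sequons` and the toy sequence are hypothetical and do not correspond to any real coronavirus protein.

```python
import re

# N-linked glycosylation sequon: Asn-X-Ser/Thr, where X is any residue except Pro.
# A lookahead is used so that overlapping sequons are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(seq: str) -> list[int]:
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(seq.upper())]


# Hypothetical toy sequence for illustration only (not a real viral protein).
toy = "MANGSAANPTQQNLT"
print(find_sequons(toy))  # [3, 13]; N8 is skipped because it is followed by Pro
```

A predicted sequon only marks a potential site. Whether it is actually used depends on membrane topology and accessibility, as illustrated by the nonfunctional N5 site of IBV E protein.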

Figure 4. Schematic diagram showing the membrane topology and PTMs of coronavirus M protein. (A) Membrane 
topology of coronavirus M protein. The exterior, interior and transmembrane domains of the protein are shown. 
(B) Major functional domains and PTMs on coronavirus M protein (not to scale). 

Endo: Endodomain; IBV: Infectious bronchitis virus; MHV: Mouse hepatitis virus; N-gly: N-glycosylation; 
O-gly: O-glycosylation; Palm: Palmitoylation; PTM: Post-translational modification; SARS-CoV: Severe acute respiratory 
syndrome coronavirus; TM: Transmembrane domain. 


Palmitoylation 

All three cysteine residues (C40, C43 and C44) in SARS-CoV E protein are also modified by palmitoylation [82], 
which may regulate its subcellular trafficking and association with lipid rafts. In fact, when the homologous cysteine 
residues in the E protein of MHV-A59 (C40, C44 and C47) were doubly or triply mutated to alanine, its ability 
to induce VLP formation was significantly reduced [84,85]. Moreover, MHV E protein carrying triple mutations 
(C40A/C44A/C47A) was prone to degradation, and the corresponding recombinant MHV had significantly 
reduced yield compared with wild-type [85]. While wild-type MHV E protein mobilized co-expressed M protein 
into detergent-soluble secreted forms, in cells expressing the triple C-to-A MHV E protein, the co-expressed M 
protein accumulated into detergent-insoluble complexes that were not secreted [84]. Therefore, palmitoylation of 
MHV E protein contributes to its stability and biological activity during assembly of mature virions. On the 
other hand, palmitoylation of SARS-CoV E protein is not required for its association with N protein and VLP 
production, and thus possibly dispensable for SARS-CoV assembly [86]. 


M protein 

M protein is the most abundant protein in the coronavirus virion. Comprising 220-260 amino acids, this 
protein is a multipass transmembrane protein (Figure 4A), with a short N-terminal ectodomain, three hydrophobic 
TM domains and a large C-terminal endodomain [9,87]. Homotypic interaction between M proteins provides the 
scaffold for virion assembly, while heterotypic interaction recruits other structural proteins and genomic RNA to 
the assembly site [88,89]. The only known PTM on coronavirus M protein is glycosylation of its ectodomain, which 
is exclusively O-linked in some Betacoronaviruses but exclusively N-linked in other coronaviruses (Figure 4B & 
Table 1) [90,91]. 


O-linked glycosylation 

O-linked glycosylation of the MHV M protein was first discovered in 1981 [90]. It was found that in the presence 
of tunicamycin, an inhibitor of N-linked glycosylation, synthesis of the S protein was completely inhibited, but M 
protein was still normally produced and glycosylated, resulting in the formation of noninfectious virions containing 
normal amounts of N and M protein, but lacking S completely [90]. When it was expressed from transfected 
cDNA, M protein of MHV-A59 also underwent O-linked glycosylation and was localized in the Golgi region [92]. 
The structures of the O-linked glycans of MHV-A59 M protein were characterized [93], and pulse-chase labeling 
experiments showed that the O-linked glycans were acquired in a two-step process: GalNAc was added before 
the addition of galactose and sialic acid [94]. After the sequential acquisition of GalNAc, galactose and sialic 
acid, the M protein of MHV-A59 was further modified in the trans-Golgi network [95]. Apart from MHV, O- 
linked glycosylation was also found in the M protein of two other lineage A Betacoronaviruses: BCoV [27,96] and 
HCoV-OC43 [97]. Since its discovery, O-linked glycosylation has been used as a marker to study the maturation, 
membrane insertion and intracellular trafficking of MHV M protein [98,99]. In fact, due to its high expression 
level in transfected or MHV-infected cells, the M protein of MHV has also been used as a model protein to study 
O-linked glycosylation and vesicular trafficking between ER and the Golgi compartments [100]. 

Initial studies proposed the four highly conserved hydroxyamino acids (S2, S3, T4 and T5) at the extreme N 
terminus of MHV M protein as the putative O-linked glycosylation sites [93]. Follow-up investigations further 
pinpointed T5 as the functional acceptor site, and the downstream P8 was also required for efficient O-linked 
glycosylation [101]. However, the hydroxyamino acid cluster per se was not sufficient, as downstream amino acids 
must also be included to introduce a functional O-linked glycosylation site into a foreign protein [101]. Interestingly, 
in the highly virulent strain MHV-2, the S-S-T-T sequence was mutated to N-S-T-T, and N-linked glycosylation 
was shown to be added to the N2 residue [102]. However, whether the presence of extra sugars affects the 
function of MHV-2 M protein is not fully understood. 

O-linked glycosylation is not essential for the assembly of MHV virions, as mutations that abolished the 
normal O-linked glycosylation site did not inhibit the budding of infectious virions [103] or growth kinetics in cell 
culture [104]. However, it was found that recombinant MHV containing N-linked glycosylated M protein induced 
a higher level of type I interferon compared with the wild-type MHV with O-linked glycosylated M protein, 
whereas MHV with nonglycosylated M protein was a poor interferon inducer in cell culture [104]. The in vitro 
interferogenic capacity also correlated with the abilities of these viruses to replicate in the liver of infected mice, 
suggesting that glycosylation status of M protein might affect the induction of innate immune response by MHV 
infection [104]. 


N-linked glycosylation 

Distinct from the O-linked glycosylation observed in the M protein of MHV, BCoV and HCoV-OC43, the M 
proteins of Alphacoronavirus TGEV [105] and PEDV [106], as well as Gammacoronavirus IBV [91] and turkey enteric 
coronavirus [107], are all modified by N-linked glycosylation, which is sensitive to endo H and can be inhibited 
by tunicamycin. The N-linked glycosylation sites were mapped to N3 and N6 of IBV (unpublished data from 
this group). Within the Betacoronavirus genus, M protein of coronaviruses in other lineages is also N-linked 
glycosylated. For example, SARS-CoV M protein contains a single N-glycosylation site at N4 [108,109]. When 
transiently transfected as a C-terminally FLAG-tagged protein, SARS-CoV M protein was found to obtain high- 
mannose N-glycans that were modified into complex N-glycans in the Golgi [29]. However, in a later study using 
SARS-CoV infected cells and purified SARS-CoV virions, glycosylated M protein was shown to remain endo H 
sensitive, suggesting that trimming and maturation of N-linked glycans were inhibited during actual SARS-CoV 
infection [109]. 

Similar to O-linked glycosylation of MHV, N-linked glycosylation of SARS-CoV M protein is not essential for 
viral replication, as recombinant SARS-CoV with glycosylation-deficient M protein had normal virion morphol- 
ogy and retained its infectivity in cell culture [110]. However, unlike O-linked glycosylation that conferred IFN 
antagonism to the MHV M protein, the IFN-antagonizing activity of SARS-CoV M protein was independent of 
N-linked glycosylation and might be mediated through its first TM domain [111]. 


N protein 

N protein (43-50 kDa) is the protein constituent of the helical nucleocapsid, which binds the RNA genome in 
a beads-on-a-string fashion. The N protein contains two major domains (Figure 5), an N-terminal domain and 
a C-terminal domain [112,113]. While both domains contribute to the binding of the viral RNA genome, the C-terminal 
domain is also important for N protein dimerization [114,115]. Linking these two major domains is a serine arginine-rich 
motif that may play an important role in the multimerization of N protein [116]. Finally, domain 3 at the 
C-terminus is shown to be critical for interaction between coronavirus N and M protein [117]. N protein is mainly 
modified by phosphorylation, which usually occurs in clusters in the N-terminal domain, serine arginine-rich motif or 
domain 3 [118,119]. Moreover, proteolytic cleavage, SUMOylation and ADP-ribosylation have also been observed 
in coronavirus N protein (Figure 5 & Table 1) [120-122]. 

Figure 5. Schematic diagram showing major functional domains and post-translational modifications on 
coronavirus N protein (not to scale). Protein-protein interactions involving some of the modified residues are also 
indicated. 

CTD: C-terminal domain; IBV: Infectious bronchitis virus; MHV: Mouse hepatitis virus; NTD: N-terminal domain; 
phos: Phosphorylation; PTM: Post-translational modification; SARS-CoV: Severe acute respiratory syndrome 
coronavirus; SR: Serine arginine-rich domain; TGEV: Transmissible gastroenteritis virus. 


Phosphorylation 

Phosphorylation of coronavirus N protein was first described in ip60K cells infected with MHV-JHM, where a 
protein kinase associated with purified virions was shown to transfer the γ-phosphate of ATP to serine residues 
of the MHV N protein [118]. A later study showed that the MHV-JHM N protein was synthesized initially in a 
nonphosphorylated 57-kDa form detected exclusively in the cytosol, while the subsequent phosphorylated 60-kDa 
form was associated with the cellular membrane fraction and mature virion [123]. Similarly, ³²P-orthophosphate 
labeling showed that the phosphorylation level of IBV N protein was significantly higher in the virion than in the 
infected cell lysates [124]. In sharp contrast, only the phosphatase-insensitive, nonphosphorylated form of N protein 
was detected in extracellular virions of BCoV-infected cells, suggesting that dephosphorylation of BCoV N protein 
may facilitate its specific assembly [125]. Therefore, the phosphorylation status of N protein may regulate 
assembly differently depending on the coronavirus in question. 

Phosphorylation sites and the corresponding protein kinases have been identified for some coronaviruses. For the 
Alphacoronavirus TGEV, four phosphorylation sites have been identified in the N protein, namely S9, S156, S254 
and S256 [126]. Using mass spectrometry, two clusters of phosphorylation sites were identified in IBV, namely amino 
acid residues S190/S192 and T378/S379 [127]. Importantly, although phosphorylated and nonphosphorylated 
IBV N protein bound to viral RNA with the same affinity, phosphorylated N protein, relative to the 
nonphosphorylated form, bound to viral RNA with higher affinity than to nonviral RNA [127]. This suggests 
that N phosphorylation may facilitate the differential recognition of viral RNA. Consistently, using a reverse 
genetic system based on Vaccinia virus, Spencer et al. showed that IBV N protein was essential for the recovery of 
recombinant IBV, and that phosphorylated IBV N protein was more efficient than partially or nonphosphorylated 
N protein [128]. Phosphorylation at T378 and S379 of IBV N protein was shown to be dependent on ATR, a 
kinase activated during IBV replication [119]. However, recombinant IBV harboring alanine substitutions at all 
four putative phosphorylation sites (S190A/S192A/T378A/S379A) could still be recovered and grew at a rate 
similar to that of wild-type IBV, suggesting that ATR-dependent phosphorylation of N protein is not essential for 
IBV replication in vitro [119]. 

As for Betacoronaviruses, the N protein of SARS-CoV can be phosphorylated by multiple host kinases, including 
cyclin-dependent kinase, glycogen synthase kinase, mitogen-activated protein kinase and casein kinase II [129]. 
Using mass spectrometry analysis, six phosphorylation sites (S162, S170, T177, S389, S424 and T428) on the 
MHV-A59 N protein were identified [130]. Phosphorylation of the N protein of SARS-CoV and MHV-JHM by the 
host protein GSK-3 was precisely mapped to S197 and S177 in the serine arginine-rich region, respectively [131]. 
Moreover, inhibition of GSK-3 by kenpaullone significantly reduced the phosphorylation level of N protein, 
as well as the supernatant virus titer and cytopathic effects in SARS-CoV-infected VeroE6 cells [131]. Therefore, 
phosphorylation of the N protein appears to be essential for the replication of some Betacoronaviruses. In fact, 
a recent study showed that phosphorylation of the MHV-JHM N protein by GSK-3 allowed the recruitment 
of RNA helicase DDX1 to facilitate template read-through, enabling the synthesis of genomic RNA and longer 
sgRNAs [132]. On the other hand, when N protein was not phosphorylated, template switching was favored during 
transcription, leading to the preferential generation of shorter sgRNAs but not genomic RNA or longer sgRNAs [132]. 
Therefore, the phosphorylation status of MHV-JHM N protein acts as a switch to regulate the process of genome 
replication/transcription. 

Phosphorylation of the SARS-CoV N protein may also affect its nucleocytoplasmic shuttling, which is mediated 
by its interaction with the host adapter protein 14-3-3 [129]. Additionally, SARS-CoV N protein was shown 
to translocate to cytoplasmic stress granules in response to cellular stress, while phosphorylation in the serine- 
arginine rich region inhibited this translocation [133]. Since stress granules play important roles in translation 
control and antiviral immune response, phosphorylation of N protein may be a strategy used by SARS-CoV to 
antagonize host antiviral mechanisms [134]. Finally, compared with SARS-CoV N protein expressed in Escherichia coli, 
recombinant SARS-CoV N protein produced by the baculovirus system in insect cells showed significantly higher 
immunoreactivity and antigenic specificity [135]. As dephosphorylation by PP1 also reduced the immunoreactivity of 
SARS-CoV N protein, it was proposed that phosphorylation might also contribute to the antigenicity of SARS-CoV 
N protein [135]. 


Proteolytic cleavage, sumoylation & ADP-ribosylation 

One early study showed that the N protein of TGEV was cleaved at D359 during the late stage of infection, 
presumably by the activated caspase-6 and -7 during TGEV-induced apoptosis [136]. Similarly, the N protein of 
SARS-CoV was also cleaved at D400 and D403 by caspases during lytic infection in Vero E6 and A549 cells, but 
not during persistent infection in Caco-2 and N2a cells [137]. Cleavage of the SARS-CoV N protein was mediated 
by caspase-6 and/or caspase-3, and was dependent on the nuclear localization of the N protein [137]. We have 
also observed cleavage of the IBV N protein during late stage IBV infection [138,139]. Thus, proteolytic cleavage 
of the N protein may be a common outcome associated with coronavirus-induced apoptosis in the infected cells, 
although the biological significance is not known. Presumably, coronavirus N protein may compete with other 
caspase substrates for cleavage, so as to promote cell survival in order to prolong the duration of virion release. 

A yeast two-hybrid screen identified Ubc9, a host protein involved in sumoylation, as an interacting partner of 
SARS-CoV N protein [140]. Biochemical analysis confirmed that SARS-CoV N protein was modified by sumoy- 
lation at lysine 62, which significantly promoted homo-oligomerization of the N protein [120,140]. The biological 
significance of this modification on the viral replication and coronavirus—host interactions remains to be investi- 
gated. 

A novel form of PTM known as ADP-ribosylation was recently recognized, in which single or multiple ADP- 
ribose moieties are covalently attached to a protein. This process is catalyzed by enzymes called poly-ADP-ribose 
polymerases and utilizes nicotinamide adenine dinucleotide as the ADP-ribose donor. Interestingly, N proteins of 
MHV, PEDV, SARS-CoV and MERS-CoV were all shown to be ADP-ribosylated in the infected cells, while ADP- 
ribosylated MHV N protein was also detected in the purified virions [121]. Notably, MHV N protein expressed from 
transfected plasmids was only ADP-ribosylated in the context of virus infection, suggesting that enzymes catalyzing 
this modification are activated by coronavirus infection and additional viral components may be involved. 

Figure 6. Schematic diagram showing the post-translational modifications on coronavirus nonstructural proteins and accessory 
proteins. The cleavage sites of PLPro and Mpro are indicated by empty circles and reverse triangles, respectively. 

IBV: Infectious bronchitis virus; MHV: Mouse hepatitis virus; N-gly: N-glycosylation; nsp: Nonstructural protein; O-gly: O-glycosylation; 
PLPro: Papain-like protease; pp: Polyprotein; SARS: Severe acute respiratory syndrome; TGEV: Transmissible gastroenteritis virus; 
Ubi: Ubiquitination. 


Nsps & accessory proteins 

Glycosylation of nsp3 & nsp4 

Among all the coronavirus nsps, three of them are known to contain TM domains that facilitate their insertion 
into ER membrane. Nsp3 and nsp4 have two and four TM domains respectively [141], while nsp6 contains six 
TM domains with a hydrophobic C-terminal cytosolic tail [142]. These three nsps are proposed to reorganize ER 
membrane to form DMVs and to facilitate the assembly and anchorage of the replication/transcription complex to 
the DMVs. In fact, co-expression of SARS-CoV nsp3, nsp4 and nsp6 induced DMV formation in the transfected 
cells [143]. A more recent study showed that, for both MERS-CoV and SARS-CoV, co-expression of nsp3 and nsp4 
was already sufficient to induce DMV formation [144]. On the other hand, overexpression of coronavirus nsp6 
induced the formation of autophagosomes [145], but at the same time restricted their expansion [146]. Therefore, nsp3, 
nsp4 and nsp6 are closely associated with cellular membrane dynamics in coronavirus-infected cells. 

Given their membrane multispanning nature, it is not surprising that some of the luminal domains undergo 
N-linked glycosylation in the ER (Figure 6). For example, MHV nsp3 is inserted into ER co-translationally and 
glycosylated at N1525 [147]. Glycosylation of nsp4 was first identified in IBV (Lim et al., 2000). By glycosidase 
digestion and site-directed mutagenesis, the glycosylation site of IBV nsp4 was confirmed to be at N48 [148]. As 
for the nsp4 of MHV, two glycosylation sites were predicted at N176 and N237. In one early study using reverse 
genetics, it was found that whereas recombinant MHV harboring nsp4-N176A mutation replicated identically to 
the WT control, nsp4-N237A was lethal and no recombinant virus could be recovered [149]. In a later study using 
an identical infectious clone system based on MHV-A59, Gadlage et al. successfully recovered recombinant MHV with 
N176A, N237A or N176A/N237A mutation in nsp4 [150]. Interestingly, all nsp4 glycosylation mutants exhibited 
aberrant morphology of DMVs and were defective in viral RNA synthesis and virus growth, supporting a critical 
role of N-linked glycosylation in the DMV formation activity of MHV nsp4 [150]. In a recent follow-up study, other 
mutations distinct from glycosylation sites were introduced in MHV nsp4. Similar to the glycosylation mutants, 
some of these mutants also exhibited altered DMV morphology. However, only mutations in the nsp4 glycosylation 
sites resulted in a loss of fitness in the recombinant MHV [151]. Therefore, apart from DMV formation, N-linked 
glycosylation of MHV nsp4 may serve other critical roles during viral replication. 


Disulfide bond formation of nsp9 of HCoV-229E 

Coronavirus nsp9 has been characterized as an ssRNA binding protein [152]. The colocalization of nsp9 with 
other replicase proteins [153] and its interaction with the coronavirus RdRP [154] suggested that the ssRNA-binding 
activity of nsp9 might play a role during coronavirus genome transcription/replication. Crystallography studies 
showed that SARS-CoV nsp9 formed a homodimer [154], and higher oligomers could be observed in solution using 
glutaraldehyde cross-linking [155]. Surprisingly, in spite of 45% sequence homology, nsp9 of HCoV-229E (but not 
SARS-CoV) was shown to form a homodimer linked by a disulfide bond (Figure 6) [155]. Mutation of the disulfide 
bond forming cysteine 69 to either alanine or serine significantly reduced the binding affinity of HCoV-229E 
nsp9 to ssRNA or ssDNA, as determined by surface plasmon resonance experiments. Although disulfide bonds 
are rare in cytosolic proteins, a disulfide-bonded form of nsp9 may be correlated with oxidative stress induced by 
HCoV-229E infection [155]. 


Ubiquitination of nsp16 of SARS-CoV 

Coronavirus nsp16 has been identified as a nucleoside-2'-O-methyltransferase (2'-O-MTase) [156]. By modifying the 
cap-0 structure at the ribose 2'-O position of the first nucleotide to form cap-1 structures, nsp16 enables the viral 
RNA to avoid detection by the cytoplasmic pattern recognition receptor MDA5. In fact, compared with wild-type 
control, recombinant virus lacking the nsp16 2’-O-MTase activity induced a high level of type I interferon in 
the infected cells, and viral replication was highly sensitive to the antiviral function of exogenous interferon [157]. 
Using yeast two-hybrid screening, von Hippel-Lindau (VHL), a component of an E3 ubiquitin ligase, was found 
to interact with SARS-CoV nsp16 [158]. Overexpression of VHL promoted the ubiquitin-proteasomal degradation 
of SARS-CoV nsp16, while knockdown of VHL increased the protein stability of nsp16 [158]. However, the 
precise ubiquitination site in SARS-CoV nsp16 has not been mapped, and similar modifications of nsp16 in other 
coronaviruses have not been characterized. Also, the physiological significance of nsp16 ubiquitination remains to 
be investigated using recombinant viruses under the setting of actual coronavirus infection. 


PTMs of coronavirus accessory proteins 

Apart from the structural and nonstructural proteins, the coronavirus genome also encodes various accessory proteins, 
most of which share no homology to any known proteins. These accessory proteins are dispensable for viral 
replication in cell culture. In fact, when the coding sequences of accessory proteins were deleted by reverse 
genetics, the resulting recombinant viruses still replicated similarly to wild-type virus [159]. However, some of 
the coronavirus accessory proteins are incorporated in mature virions, while others have been implicated in the 
modulation of host immune response and in vivo pathogenesis [159]. Only a few coronavirus accessory proteins are 
known to be modified by PTMs (Figure 6 & Table 1). 

Apart from the S protein, some Betacoronaviruses also encode the HE protein, which forms homodimers and 
constitutes a second type of shorter projections on the virion surface [10]. Similar to S protein, the HE protein 
of MHV was also found to be modified by N-linked glycosylation, which was inhibited by tunicamycin but not 
monensin [160]. The HE protein of BCoV was also shown to be glycosylated when expressed using a human 
adenovirus vector [161]. The importance of N-linked glycosylation for the function of coronavirus HE protein has 
not been fully characterized. 

Interestingly, although SARS-CoV M protein is N-linked glycosylated, its accessory protein 3a is O-linked glyco- 
sylated [162]. The SARS-CoV protein 3a and M share the same N-exo/C-endo membrane topology, and both proteins 
contain three TM domains [163]. O-linked glycans of the SARS-CoV protein 3a were resistant to PNGase F 
treatment, and pulse-chase analysis suggested that the oligosaccharides were acquired post-translationally [162]. 
Protein 3a has been implicated in modulating host immune response, such as upregulating fibrinogen expression [164] 
and production of proinflammatory cytokines [165]. However, whether O-linked glycosylation contributes to the 
immune-modulating activities of SARS-CoV protein 3a is not known. 

In animal isolates and early human isolates, the sgRNA8 of SARS-CoV encoded a single protein 8ab. However, 
in later human isolates during the peak of SARS-CoV epidemic, a 29-nt deletion in the center split ORF8 into two 
smaller ORFs, encoding proteins 8a and 8b respectively [166]. Whereas protein 8ab is co-translationally imported 
into the ER and is N-linked glycosylated at N81, protein 8b is synthesized in the cytosol and not modified [167]. 
Both proteins 8b and 8ab were shown to interact with mono-ubiquitin and polyubiquitin, and both were also 
modified by ubiquitination. However, whereas glycosylation at N81 stabilized protein 8ab and protected it from 
proteasomal degradation, protein 8b was highly unstable and underwent rapid proteasomal degradation [168]. The 
ubiquitinated 8b and 8ab may mediate rapid degradation of IRF3 and regulate host antiviral innate immunity [169]. 

The accessory protein 3b of TGEV is encoded between the S and M genes in the Purdue strain, but it is 
truncated in some lab-passaged strains. TGEV 3b protein is translated via an internal entry mechanism, possibly 
in conjunction with leaky scanning [170]. In cells infected with the Purdue strain of TGEV, two forms of 3b 
protein were detected: a 31 kDa N-glycosylated, membrane-associated form and a 20 kDa nonglycosylated soluble 
form [171]. The TGEV 3b protein was not essential for viral replication and was not incorporated in mature virions. 
Its role in pathogenesis is not completely understood, although deletion of ORF3b was found in some naturally 
attenuated TGEV strains such as Miller M60 [172]. 


Interfering PTMs of host proteins by coronavirus proteins 

Deubiquitinating activity of coronavirus PLPro 

Coronaviruses encode one or two PLPro domains in nsp3, which carry out the proteolytic cleavage that releases nsp1, nsp2 
and nsp3 from the polyprotein [15,173,174]. Apart from its protease activity, SARS-CoV PLPro was also shown to 
possess deubiquitinating (DUB) activity [175,176], which was also identified later for PLP2 of HCoV-NL63 [177] and 
MHV-A59 [178], as well as PLPro of MERS-CoV [179,180] and IBV [181]. Structural studies revealed that SARS-CoV 
PLPro shared a similar fold with known DUB enzymes, but exhibited several distinct features [182]. Later studies 
showed that apart from ubiquitin, the PLPro of SARS-CoV and MERS-CoV also recognized another ubiquitin-like 
modifier interferon-stimulated gene 15 (ISG15), and served as a deISGylating enzyme [179,180,183]. Interestingly, the 
DUB/deISGylating activity of coronavirus PLPro could be separated from its protease activity. The crystal structure 
of SARS-CoV PLPro in complex with a human ubiquitin analog has been determined, and certain mutations in 
the interacting regions were shown to compromise ubiquitin binding without affecting the protease activity of 
PLPro [184]. Similarly, using the structure of MERS-CoV PLPro in complex with Ub as a guide, mutations were 
introduced into PLPro that specifically disrupted the DUB function without affecting its proteolytic activity. Unlike 
wild-type PLPro, the DUB-lacking variants were deficient in suppressing IFN promoter activation [185]. 

In terms of biochemistry, PLPro from different coronaviruses seems to have slightly different substrate specificities 
and enzyme properties. SARS-CoV PLPro greatly prefers K48-linked to K63-linked ubiquitin chains. The specificity 
of SARS-CoV PLPro toward polyUb(K48) was proposed to be determined by its extended conformation and 
binding via two contact sites [186]. In contrast, the PLPro of MERS-CoV cleaves polyUb chains with broad linkage 
specificity. Also, whereas MERS-CoV PLPro cleaves polyUb chains one Ub at a time, SARS-CoV PLPro cleaves 
K48-linked polyUb chain in a 'di-distributive' manner, that is, removing a di-Ub moiety at a time [187]. 

Since ubiquitination and ISGylation are critical for signal transduction in innate immunity, the DUB 
and deISGylating activities of coronavirus PLPro are well characterized as antagonists of host antiviral response 
(Figure 7) [188]. Initial studies identified SARS-CoV PLPro as a potent IFN antagonist by interacting with IRF3 and 
inhibiting its phosphorylation and nuclear translocation, thereby blocking type I IFN production [189]. Subsequently, 
it was found that SARS-CoV PLPro could also inhibit TNFα-induced NF-κB activation [190] and block the 
production of proinflammatory cytokines and chemokines in activated cells [179]. The IFN antagonist activity 
of coronavirus PLPro can be mediated by multiple mechanisms, which may or may not involve its protease 
and DUB activities [191]. PLP2 of MHV-A59 was found to directly deubiquitinate IRF3 and prevent its nuclear 
translocation [178]. It also deubiquitinated upstream TBK1 and reduced its kinase activity, thereby inhibiting IFN 
signaling [192]. PLPro of SARS-CoV was shown to remove K63-linked ubiquitin chains from TRAF3 and TRAF6, 
thereby suppressing the activation of TBK1 in cells treated with TLR7 agonist [193]. Alternatively, membrane- 
anchored SARS-CoV PLPro might physically interact with the STING-TRAF3-TBK1 complex to inhibit the 
phosphorylation and dimerization of IRF3, thereby suppressing the STING/TBK1/IKKε-mediated activation of 
type I IFN [194,195]. Finally, using a constitutively active phospho-mimetic IRF3, it was recently shown that the 
DUB activity of PLPro also inhibited IRF3 at a postactivation step [196]. 


Other coronavirus proteins that modulate PTMs of host proteins 

Apart from the most well-characterized DUB/deISGylation activities of coronavirus PLPro, other coronavirus 
proteins have also been implicated in regulating PTMs of host proteins (Figure 7). For example, in addition to the 
DUB activity encoded in the nsp3 of SARS-CoV, its SARS-unique domain (SUD) can also enhance a cellular E3 
ubiquitin ligase called RING finger and CHY zinc finger domain-containing 1 (RCHY1), which leads to 
proteasomal degradation of p53 [197]. Cellular p53 
inhibits replication of SARS-CoV and HCoV-NL63, presumably by activating genes involved in innate immunity. 
Thus, by targeting p53 for RCHY1-mediated ubiquitination and proteasomal degradation, the SUD of nsp3 
may contribute to the pathogenesis of SARS-CoV [197]. Similarly, SARS-CoV ORF9b was found to localize to 
mitochondria and induce ubiquitination and proteasomal degradation of DRP1, leading to the elongation of 
mitochondria. It might also hijack a ubiquitin E3 ligase called AIP4 to trigger the degradation of MAVS, TRAF3 
and TRAF6, thereby significantly suppressing IFN responses [198]. 

Figure 7. Schematic diagram summarizing the known mechanisms by which coronavirus proteins interfere with the post-translational 
modifications of host proteins. Proteins in blue are pattern recognition receptors, and proteins in purple are E3 ubiquitin ligases. Green 
ellipses and rectangles are transcription factors and response elements involved in type I interferon induction and signaling, respectively. 
Black pointed arrows indicate activation, and blunt-ended red lines indicate inhibition. See text for detail. 

On the other hand, ubiquitination of some cellular proteins is suppressed by coronavirus proteins. TRIM25 is 
an E3 ubiquitin ligase that associates with and activates RIG-I by mediating its ubiquitination. The N protein 
of SARS-CoV was found to bind to the SPRY domain of TRIM25 and inhibit TRIM25-dependent RIG-I 
activation, thereby suppressing the type I IFN production induced by poly(I:C) or Sendai virus [199]. Similarly, the 
accessory protein 6 of SARS-CoV was shown to interact with the IFN-signaling pathway-mediating protein Nmi 
and promote its ubiquitin-dependent proteasomal degradation, thereby potentially modulating the virus-induced 
innate immune response [200]. 

The enzymatic activity of some nonstructural proteins can directly modify host proteins. For example, 
porcine deltacoronavirus (PDCoV) nsp5 was shown to mediate the cleavage of NF-κB essential modulator (NEMO) at 
glutamine 231, thereby significantly inhibiting IFN-β production induced by Sendai virus infection [201]. Later, 
it was shown that the nsp5 of PDCoV also cleaves STAT2 and impairs its ability to induce the expression 
of ISGs [202]. Therefore, PDCoV nsp5 mediates the cleavage of key players to inhibit both the production and 
signaling of type I interferons. 


Conclusion 

Accumulating evidence suggests that coronavirus proteins are subjected to various PTMs by the host cells. 
Transmembrane structural proteins (S, E and M), nonstructural proteins (nsp3 and nsp4) and accessory proteins 
(SARS-CoV 3a) are modified by glycosylation. Although glycosylation of coronavirus S protein is essentially 
N-linked, the M proteins of lineage A betacoronaviruses carry O-linked glycans, while the M proteins of 
other coronaviruses are modified by N-linked glycosylation. Some coronavirus S and E proteins are 
palmitoylated at cytosolic cysteine residues, while the N protein is mainly phosphorylated by multiple host kinases. 
The conserved types and sites of PTMs on these proteins suggest that co-option of PTMs to regulate coronavirus 
proteins has a long evolutionary history, while the diversity of PTMs on numerous coronavirus proteins highlights 
their importance in viral replication and pathogenesis. 

PTMs contribute significantly to the functions of coronavirus proteins. Apart from facilitating the folding and 
intracellular trafficking of the coronavirus S protein, N-linked glycans also constitute a significant part of the protein 
mass and profoundly affect the conformation of the mature S protein and its binding to surface receptors. N-linked 
glycans may play a role in the antigenicity of S protein, and glycosylation may also contribute to the induction 
of innate immune response, thereby affecting the viral pathogenesis. Phosphorylation of coronavirus N protein 
improves its selective binding to viral RNA and may regulate the uncoating and assembly process during replication. 
Importantly, the phosphorylation status of MHV-JHM N protein, controlled by the host kinase GSK-3, acts as 
a switch to regulate genome replication/transcription, although a similar mechanism has not been described for 
other coronaviruses. Coronaviruses also employ multiple mechanisms to interfere with PTMs of host proteins. In 
particular, the DUB and deISGylating activities encoded by nsp3 suppress the induction and signaling of type I 
interferons. 


Future perspective 

The functional implications of PTMs on many coronavirus proteins have not been fully characterized, and their 
biological significance requires further investigation combining reverse genetics and suitable in vivo models. 
However, two obstacles have hindered investigation into the function of a PTM at a specific site of a coronavirus 
protein in virus replication and pathogenesis. First, some functionally important domains carry multiple 
modification sites, as exemplified by the more than 20 predicted N-linked glycosylation sites on various functional 
domains of coronavirus S protein and the multiple phosphorylation sites in coronavirus N protein. Second, 
sensitive and specific methods for detecting individual PTMs in live cells and infectious particles are lacking. 
In addition, it appears that the functional effect of a mutation at a canonical site can be compensated 
by the same PTM at an alternate site. This is especially true for proteins with multiple sites for a certain PTM, 
such as N-linked glycosylation of S protein and phosphorylation of N protein. 

With the advent of innovative labeling techniques (such as HO! labeling) and the ever-growing capacity 
of mass spectrometry, systematic identification of conventional and novel PTMs will be greatly accelerated over 
the next decade. Also, as we better understand the detailed molecular mechanisms behind PTMs, functional 
studies will shift from relying on less specific inhibitors to targeted depletion of key modifying enzymes using 
gene knockdown/knockout approaches based on CRISPR technologies. Assisted by the exquisite structural and 
biochemical investigation of PLPro and other coronavirus proteins, future studies will further reveal the mechanisms 
of how these proteins interfere with host PTMs and modulate viral pathogenesis. Undoubtedly, coronavirus reverse 
genetics will remain the cornerstone for characterizing the biological significance of PTMs in coronavirus proteins, 
which will also be facilitated by the recent development of various transgenic in vivo models. 

In terms of translational applications, PTMs of coronavirus proteins and the interference with PTMs of host 
proteins by coronavirus proteins may be attractive targets for therapeutic intervention. For instance, 
carbohydrate-binding agents that directly interact with glycans on the virion surface may be able to suppress virus 
attachment and entry. As more than one N-linked glycosylation site is present, multiple mutations will be required 
for the virus to develop drug resistance. On the other hand, recombinant coronaviruses with the DUB/deISGylating 
activity specifically deleted from PLPro may be desirable vaccine candidates, as these viruses will retain the 
protease activity required for replication, but become substantially attenuated as they are defective in subverting 
the host innate immune response. Given the importance of coronaviruses in both veterinary settings and public 
health, a better understanding of the PTMs of coronavirus proteins will provide new insights into the development 
of more efficient vaccines and novel antivirals. 
Executive summary 


• Coronaviruses are pathogens of veterinary and medical importance. 

• Coronavirus replication occurs in the cytoplasm and relies on the protein synthesis machinery and membranous 
networks of the host cell. 

• Post-translational modifications are the highly regulated, enzyme-catalyzed covalent modifications of proteins 
after they are translated. 

Spike protein 

• Coronavirus spike (S) protein is modified by disulfide bond formation, N-linked glycosylation and palmitoylation. 

• Disulfide bond formation is required for the proper folding and trimerization of coronavirus S protein. 

• N-linked glycosylation contributes to the antigenicity, fusogenic and immunomodulatory activities of the S 
protein. 

• Palmitoylation in the conserved cytosolic cysteine residues is required for trafficking and folding of S protein, and 
contributes to virion assembly and infectivity. 

Envelope protein 

• Coronavirus envelope (E) protein is modified by N-linked glycosylation and palmitoylation. 

• N-linked glycosylation in the severe acute respiratory syndrome coronavirus (SARS-CoV) E protein depends on its 
membrane topologies. 

• Palmitoylation contributes to the stability and trafficking of mouse hepatitis virus (MHV) E protein, and is 
required for efficient assembly and infectivity of the virions. 

Membrane protein 

• The membrane protein of some betacoronaviruses is modified by O-linked glycosylation, whereas that of other 
coronaviruses is modified by N-linked glycosylation. 

• Neither O-linked nor N-linked glycosylation is required for virion assembly. 

• O-linked glycosylation contributes to the induction of type I interferon and in vivo pathogenesis of MHV. 

Nucleocapsid protein 

• Coronavirus nucleocapsid (N) protein is modified by phosphorylation, proteolytic cleavage, ADP-ribosylation and 
sumoylation. 

• Phosphorylation of infectious bronchitis virus N protein contributes to the specificity of RNA binding, whereas 
phosphorylation of MHV N protein is involved in template read-through during genome replication/transcription. 

• Coronavirus N protein is cleaved by caspase during coronavirus-induced apoptosis. 

• ADP-ribosylated N protein can be detected in coronavirus-infected cells and in purified virions. 

Nonstructural proteins & accessory proteins 

• Coronavirus nonstructural protein 3 (nsp3) and nsp4 are multispanning transmembrane proteins modified by 
N-linked glycosylation, which may play critical roles during viral RNA synthesis and double membrane vesicle 
formation. 

• HCoV-229E nsp9 forms a homodimer linked by a disulfide bond, which may affect its binding affinity to RNA. 

• SARS-CoV nsp16 interacts with and is ubiquitinated by the E3 ligase von Hippel-Lindau. 

• The hemagglutinin-esterase of some betacoronaviruses, 3a and 8ab of SARS-CoV and 3b of TGEV are accessory 
proteins modified by glycosylation. 

Interference with post-translational modifications of host proteins by coronavirus proteins 

• Coronavirus nsp3 encodes deubiquitinating and deISGylating activities, which suppress the production and 
signaling of type I interferon. 

• The SARS-unique domain in the SARS-CoV nsp3 enhances the E3 ligase activity of RCHY1 and promotes 
proteasomal degradation of p53, thereby modulating the activation of innate immune response. 

• SARS-CoV N, ORF6 and ORF9b proteins interfere with ubiquitination of cellular proteins and modulate host 
innate immunity. 

• PDCoV nsp5 mediates cleavage of NEMO and STAT2 to suppress production and signaling of type I interferon. 